NSF PAR Search | NSF Public Access Repository


Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked?

Song, Seok Hwan; Chakraborty, Mohna; Li, Qi; Tavanapong, Wallapak (July 2025, Findings of the Association for Computational Linguistics ACL 2025)

Free, publicly-accessible full text available July 27, 2026
How Much Do Prompting Methods Help LLMs on Quantitative Reasoning with Irrelevant Information?

https://doi.org/10.1145/3627673.3679840

Song, Seok Hwan; Tavanapong, Wallapak (October 2024, ACM)

Real-world quantitative reasoning problems are complex, often including extra information irrelevant to the question (or “IR noise” for short). State-of-the-art (SOTA) prompting methods have increased the ability of Large Language Models to perform quantitative reasoning on grade-school Math Word Problems (MWPs). To assess how well these SOTA methods handle IR noise, we constructed four new datasets with IR noise, each consisting of 300 problems from each of the four public datasets: MAWPS, ASDiv, SVAMP, and GSM8K, with added IR noise. We called the collection of these new datasets “MPN” (Math Word Problems with IR Noise). We evaluated SOTA prompting methods using MPN. We propose Noise Reduction Prompting (NRP) and its variant (NRP+) to reduce the impact of IR noise. Findings: Our IR noise significantly degrades the performance of Chain-of-Thought (CoT) Prompting on three different backend models: ChatGPT (gpt-3.5-turbo-0613), PaLM2, and Llama3-8B-instruct. Among them, ChatGPT offers the best accuracy on MPN with and without IR noise. With IR noise, the performance of CoT, Least-To-Most Prompting, Progressive-Hint Prompting, and Program-aided Language Models with ChatGPT was significantly degraded, each with an average accuracy drop of more than 12%. NRP is least impacted by the noise, with a drop in average accuracy of only around 1.9%. Our NRP+ and NRP perform comparably in the presence of IR noise.
Full Text Available
IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis

https://doi.org/10.1145/3587819.3592558

Mohammed, Abdurahman Ali; Fonder, Catherine; Sakaguchi, Donald S.; Tavanapong, Wallapak; Mallapragada, Surya K.; Idris, Azeez (June 2023, MMSys '23: Proceedings of the 14th Conference on ACM Multimedia Systems)

We present a new annotated microscopic cellular image dataset to improve the effectiveness of machine learning methods for cellular image analysis. Cell counting is an important step in cell analysis. Typically, domain experts manually count cells in a microscopic image. Automated cell counting can potentially eliminate this tedious, time-consuming process. However, a good, labeled dataset is required for training an accurate machine learning model. Our dataset includes microscopic images of cells and, for each image, the cell count and the locations of individual cells. The data were collected as part of an ongoing study investigating the potential of electrical stimulation to modulate stem cell differentiation and possible applications for neural repair. Compared to existing publicly available datasets, our dataset has more images of cells stained with a wider variety of antibodies (protein components of immune responses against invaders) typically used for cell analysis. The experimental results on this dataset indicate that none of the five existing models in this study achieves counts accurate enough to replace manual counting. The dataset is available at https://figshare.com/articles/dataset/Dataset/21970604.
Full Text Available
Identifying Policy Agenda Sub-Topics in Political Tweets based on Community Detection

https://doi.org/10.1145/3110025.3116208

Iyer, Rohit; Wong, Johnny; Tavanapong, Wallapak; Peterson, David A. (July 2017, ASONAM '17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017)
Full Text Available
Social Media in State Politics: Mining Policy Agendas Topics

https://doi.org/10.1145/3110025.3110097

Qi, Lei; Li, Rihui; Wong, Johnny; Tavanapong, Wallapak; Peterson, David A. (July 2017, ASONAM '17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017)
Full Text Available
